Comprehensive integrated single-cell RNA sequencing analysis of brain metastasis and glioma microenvironment: Contrasting heterogeneity landscapes

Understanding the specific type of brain malignancy, source of brain metastasis, and underlying transformation mechanisms can help provide better treatment and less harm to patients. The tumor microenvironment plays a fundamental role in cancer progression and affects both primary and metastatic cancers. The use of single-cell RNA sequencing to gain insights into the heterogeneity profiles in the microenvironment of brain malignancies is useful for guiding treatment decisions. To comprehensively investigate the heterogeneity in gliomas and brain metastasis originating from different sources (lung and breast), we integrated data from three groups of single-cell RNA-sequencing datasets obtained from GEO. We gathered and processed single-cell RNA sequencing data from 90,168 cells obtained from 17 patients. We then employed the R package Seurat for dataset integration. Next, we clustered the data within the UMAP space and acquired differentially expressed genes for cell categorization. Our results underscore the significance of macrophages as abundant and pivotal constituents of gliomas. In contrast, lung-to-brain metastases exhibit elevated numbers of AT2, cytotoxic CD4+ T, and exhausted CD8+ T cells. Conversely, breast-to-brain metastases are characterized by an abundance of epithelial and myCAF cells. Our study not only illuminates the variation in the TME between brain metastasis with different origins but also opens the door to utilizing established markers for these cell types to differentiate primary brain metastatic cancers.


Introduction
The brain tissue environment differs significantly from that of other tissues, which contributes to the low incidence of glioma metastasis while also providing a more conducive environment for invasive cancers that reach the brain [1,2]. Brain malignancies encompass two distinct categories of tumors: gliomas (GM) and brain metastasis (BM). GM refers to primary tumors that originate within the brain, whereas BM refers to secondary tumors that develop from primary tumors originating outside the brain, such as lung, breast, and colorectal cancers and melanoma [3,4]. Among central nervous system (CNS) cancers, BM is the most prevalent subtype [5-7], occurring in 30% to 40% of individuals with cancer [8]. Differentiating between BM and GM poses considerable challenges due to their similar features and origins. Precisely distinguishing between the two forms of cancer is crucial, as their distinct treatment strategies greatly affect patient outcomes. Detecting metastasis early can facilitate identifying initial lesions in asymptomatic individuals [9-11]. The treatment approach for BM typically involves a combination of surgery, radiation therapy, and chemotherapy. The objective is to remove as much of the tumor as possible while preserving brain function and minimizing side effects. Additionally, targeted therapies may be employed in some cases [12,13]. Although combinations of immune checkpoint blockade therapies have shown success in BM patients, not all individuals respond, which presents a therapeutic challenge [14]. BM is a grave medical condition, and its prognosis varies with factors such as the primary cancer type, metastasis size and location, and the patient's overall health. The diagnosis, prognosis, and treatment selection for cancer are influenced by histological and phenotypic variation across tumor types [15]. Nonetheless, individuals with metastatic brain tumors form a diverse group, making it challenging to predict prognosis based solely on the origin of the primary tumor [16].
Tumors consist of diverse cell types, including immune and stromal cells, as well as regulatory molecules that influence tumor growth dynamics. Collectively, these constituents give rise to an intricate tumor microenvironment (TME) [12]. Moreover, the TME plays a pivotal role in metastasis progression [17], and current research underscores its substantial influence on treatment response and clinical outcomes in cancer, warranting further investigation. Therefore, gaining deeper insight into the TME landscape holds significant value. Recent studies have examined TME attributes in brain metastases irrespective of their primary cancer source [3,14].
Single-cell RNA sequencing (scRNA-seq) is a practical approach to understanding human cancers. This methodology enables the identification of cancer diagnostic biomarkers by analyzing the transcriptomes of individual cells and exploring the diversity within tumor cells [1,12]. Progress in devising novel therapeutic strategies for brain metastases associated with diverse cancer types has been gradual, with recent breakthroughs focusing on targeting metastatic subclones and discerning selective niches. Earlier investigations primarily centered on evaluating the heterogeneity of glioma tumors and brain metastasis through single-cell RNA sequencing [1,3,4,6,8,12,17-21].
The brain's TME is now widely recognized as a pivotal regulator of cancer, offering potential avenues for innovative treatment approaches [22]. Accordingly, substantial research effort has recently been devoted to exploring the interplay between TME heterogeneity and cancer progression, and many studies have examined diverse aspects of the heterogeneity of individual tumors [23-26].
Given the critical significance of early brain metastasis detection to enhance treatment outcomes, alongside the need to differentiate between gliomas and brain metastasis, our present study leveraged the tumor microenvironment's heterogeneity and high-abundance cellular population markers to facilitate the identification and distinction of various brain tumor types.
To illuminate the intricate nature of the TME, we integrated single-cell RNA sequencing data from human gliomas and brain metastases originating from lung (BM-lung) and breast (BM-breast) tissues. Subsequently, we undertook a comparative analysis of the cellular and subtype heterogeneity across these integrated datasets. This analytical approach yielded comprehensive insight into the profiles of brain metastases, gliomas, and the prognosis of primary cancers.

Sample collection and clinical characteristics
We sourced single-cell transcriptome data from several datasets in the Gene Expression Omnibus (GEO) database: breast-to-brain metastasis (GSE186344, GSE143423), lung-to-brain metastasis (GSE186344, GSE131907, GSE143423), and glioma tumors (GSE117891, GSE202371, GSE135045). The samples included patients with both wild-type and mutated forms and a mix of males and females. Our glioma samples encompassed all three types (astrocytoma, oligodendroglioma, and ependymoma) across the 4 grades of the WHO classification [27]. The average age of glioma patients was 59 years. For patients with lung-to-brain and breast-to-brain metastases, the average ages were 57 and 53 years, respectively. Importantly, none of the surgical samples had undergone any prior treatment. Among patients with lung-to-brain metastasis, both histological types, small cell lung cancer (SCLC) and non-small cell lung cancer (NSCLC), were represented. Similarly, among patients with breast-to-brain metastasis, both triple-negative breast cancer (TNBC) and estrogen/progesterone receptor-positive (ER/PR+) types, including human epidermal growth factor receptor 2 (HER2)-negative and HER2-positive subtypes, were included. All tumor tissues were obtained from the frontal lobe, cerebellum, parietal lobe, or temporal lobe. A total of 88,291 cells were acquired from 14 individuals using the Chromium 10X Genomics scRNA-seq protocol across all three tumor groups. Additionally, 10,035 tumor cells from three glioma samples profiled with the Drop-seq protocol, as detailed in the GSE135045 study, were analyzed. All data collected in this study were mapped to the GRCh38 human reference genome.

Processing of ScRNA-seq datasets
To uncover common sources of biological variation, we combined the single-cell gene expression datasets using the integration tools of the R package Seurat, version 4.1. We excluded low-quality cells (those with fewer than 200 expressed genes) and genes expressed in fewer than 3 cells. Subsequently, cells with more than 20% mitochondrial gene content and those with excessively low or high gene counts were filtered out. All datasets then underwent standard preprocessing and were normalized with the LogNormalize method of Seurat's NormalizeData function, using a scale factor of 10,000. Using the variance stabilizing transformation method (selection.method = "vst"), we identified the top 2,000 highly variable genes across all samples with Seurat's FindVariableFeatures function [28].
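The filtering and normalization rules above can be illustrated with a minimal Python sketch. It is not the Seurat implementation used in this study: the function names are hypothetical, each cell is assumed to be a dictionary of raw counts keyed by gene symbol, and mitochondrial genes are assumed to carry the human "MT-" prefix. The bounds for "excessively low or high gene counts" are not stated in the text and are therefore omitted.

```python
import math

def keep_genes(cells, min_cells=3):
    """Return the genes detected (count > 0) in at least `min_cells` cells."""
    n_detected = {}
    for counts in cells:
        for gene, count in counts.items():
            if count > 0:
                n_detected[gene] = n_detected.get(gene, 0) + 1
    return {gene for gene, n in n_detected.items() if n >= min_cells}

def passes_qc(counts, min_genes=200, max_mito_fraction=0.20):
    """Apply the two per-cell rules: >= 200 expressed genes, <= 20% mitochondrial counts."""
    n_genes = sum(1 for c in counts.values() if c > 0)
    total = sum(counts.values())
    if total == 0 or n_genes < min_genes:
        return False
    # Assumption: mitochondrial genes are recognised by the "MT-" prefix.
    mito = sum(c for gene, c in counts.items() if gene.startswith("MT-"))
    return mito / total <= max_mito_fraction

def log_normalize(counts, scale_factor=10_000):
    """LogNormalize-style transform: log1p(count / cell total * scale factor)."""
    total = sum(counts.values())
    return {gene: math.log1p(c / total * scale_factor) for gene, c in counts.items()}
```

With a scale factor of 10,000, a gene that accounts for a quarter of a cell's counts is mapped to log(1 + 2500), independent of sequencing depth.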

Data integration and analysis
We used Seurat's anchor-based strategy to integrate the datasets. We first identified anchors between the datasets and then integrated them using 2000 anchors derived from the accurately identified anchor set. Next, scaling and principal component analysis (PCA) were performed on the integrated dataset. To heuristically estimate the dataset's dimensionality, we used Seurat's ScoreJackStraw and ElbowPlot functions (S1A and S1B Fig). Subsequently, the first 35 principal components (PCs) were used to perform UMAP for nonlinear dimensional reduction. For clustering, we applied the graph-based approach from the Seurat package: we built a K-nearest neighbor (KNN) graph with the FindNeighbors function and then clustered the cells with the FindClusters function at a resolution of 0.2. All other parameters were left at their Seurat defaults. Finally, we determined cell types by examining known markers in each cluster.
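The graph construction behind this clustering step can be sketched as follows. This is an illustrative Python approximation rather than Seurat's code. It assumes each cell is a tuple of PC coordinates, finds Euclidean K-nearest neighbors by brute force, and weights each pair of cells by the Jaccard overlap of their neighbor sets, which is how the Seurat documentation describes FindNeighbors. The names `k` and `prune` and their values are hypothetical choices for the toy example. The community detection performed by FindClusters is not reproduced here.

```python
import math

def knn(points, k):
    """For each point, the set of indices of its k nearest points (itself included)."""
    neighbours = []
    for i, p in enumerate(points):
        # Sort by distance; the second key places the point itself first on ties.
        order = sorted(range(len(points)),
                       key=lambda j: (math.dist(p, points[j]), j != i))
        neighbours.append(set(order[:k]))
    return neighbours

def snn_edges(points, k=3, prune=1 / 15):
    """Shared-nearest-neighbour edges weighted by Jaccard overlap of KNN sets."""
    nn = knn(points, k)
    edges = {}
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            jaccard = len(nn[i] & nn[j]) / len(nn[i] | nn[j])
            if jaccard > prune:  # drop weak edges
                edges[(i, j)] = jaccard
    return edges
```

On two well-separated groups of points, every retained edge connects cells within the same group, which is the structure that a community-detection step then partitions into clusters.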

Differential Expression Analysis of Genes (DEGs)
We conducted differential expression analysis using the non-parametric Wilcoxon rank-sum test provided by Seurat's FindMarkers function. This analysis identifies genes that are expressed differently between two groups of cells. For both cell groups, we set the minimum percentage threshold (min.pct) to 0.25. Additionally, we set a threshold of 0.5 for the log fold-change in average expression (logfc.threshold) between the two cell groups.
To identify genes that were upregulated or downregulated in each cluster, we calculated the average log fold-change between that cluster and the other clusters. This allowed us to quantify the extent of gene expression changes within each cluster.
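To make the two thresholds concrete, the following Python sketch shows how a single gene might be screened before testing. It is a simplified illustration, not FindMarkers itself. The function names are hypothetical, the inputs are assumed to be log-normalized expression values for one gene in each group, and the fold-change convention (log2 of mean back-transformed expression plus a pseudocount of 1) is an assumption that may differ in detail from the Seurat version used. Following the Seurat documentation, the gene is treated as detected if it reaches min.pct in either group. The Wilcoxon rank-sum test would then be applied only to genes that pass.

```python
import math

def pct_expressed(values):
    """Fraction of cells in which the gene is detected (value > 0)."""
    return sum(1 for v in values if v > 0) / len(values)

def avg_log2fc(group_a, group_b, pseudocount=1.0):
    """Log2 fold-change of mean expression after undoing the log1p transform."""
    mean_a = sum(math.expm1(v) for v in group_a) / len(group_a)
    mean_b = sum(math.expm1(v) for v in group_b) / len(group_b)
    return math.log2(mean_a + pseudocount) - math.log2(mean_b + pseudocount)

def passes_prefilter(group_a, group_b, min_pct=0.25, logfc_threshold=0.5):
    """True if the gene is detected often enough and changes strongly enough."""
    detected = max(pct_expressed(group_a), pct_expressed(group_b)) >= min_pct
    return detected and abs(avg_log2fc(group_a, group_b)) >= logfc_threshold
```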

Cell population analysis in glioma and lung-to-brain metastases
In this study, we focused on BM originating from different types of cancer and compared these metastases with each other and with GM. We used several human scRNA-seq datasets from GM and BM-lung obtained from the GEO database. Details such as GEO ID, cancer type, scRNA-seq method, number of samples, and post-filtering cell count are presented in Fig 1. For the comparative analysis between GM and BM-lung, we pre-processed the transcriptomes of 60,331 distinct tumor cells originating from 8 GM and 5 BM-lung samples. We investigated the microenvironmental landscape and immunological state within GM and BM-lung. To achieve this, similar cell types across the datasets were grouped together in a unified UMAP space, generated by integrating the datasets into a low-dimensional representation. The arrangement of each dataset within this integrated UMAP space is illustrated in [...] (MOG, CLDN11, MOBP), fibroblasts (DCN, THY1, COL1A1), oligodendrocyte precursor cells (OPC) (GFAP, SLC1A3, AQP4), dendritic cells (HLA-DQB1, HLA-DRB1, HLA-DPB1), endothelial cells (PECAM1, CLDN5, FLT1), and undetermined cells (depicted in Fig 2C) [6,21,29-36]. The top 100 up-regulated genes for each cluster, based on the differential expression analysis, are listed in S1 Data. Our findings highlight significant differences in cell population composition between BM-lung and GM, particularly in the abundance of immune cell types. Macrophages (30% in GM, 3% in BM-lung) and monocytes (4.7% in GM, 0.01% in BM-lung) were notably more abundant in GM, whereas BM-lung exhibited a higher frequency of CD4+ T cells (11.9% in BM-lung, 1.5% in GM) and CD8+ T cells (7.9% in BM-lung, 3% in GM). Moreover, CNS cells, including oligodendrocytes (13.6% in GM, 1.4% in BM-lung), astrocytes (9.3% in GM, 2.4% in BM-lung), and OPC (6.3% in GM, 0.17% in BM-lung), were more prevalent in GM. In contrast, epithelial cells, particularly AT2 cells, were more prevalent in BM-lung (13% in BM-lung, 2.1% in GM) (S1 Table in S1 File).

Comparative analysis of immune cell sub-clusters in glioma and lung-to-brain metastasis
To gain a deeper understanding of the immune cell subpopulations, we conducted targeted analyses of macrophages and T cells. Using the FindClusters function, we isolated five distinct sub-clusters within macrophage cells. Based on gene expression markers, we delineated two primary cell types (Fig 3A). In the brain's tumor microenvironment, tumor-associated macrophages (TAM) were found to exist in two sub-clusters: monocyte-derived macrophages (MDM) (identified by the markers EMP3, ACP5, and LYZ) and microglia (MG) (characterized by CX3CR1, TMEM119, and P2RY12) (Fig 3B) (S2A and S2B Fig). Sub-clusters 2, 3, and 4 were dominated by MG and were more prevalent in GM (67.3% in GM and 17.4% in BM-lung). Conversely, sub-clusters 1 and 5 were identified as MDM and were more abundant in BM-lung (82.5% in BM-lung and 32.6% in GM) (S2 Table in S1 File). The percentages of each sub-cluster are shown in a bar plot (Fig 3C). Furthermore, we extended our investigation to CD4+ T cells, which were categorized into five sub-clusters using the FindClusters function. These sub-clusters encompassed naive CD4+ T cells (marked by TCF7, CCR7), cytotoxic CD4+ T cells (characterized by IFNG, GZMM, GZMA, GNLY), regulatory (Treg) CD4+ T cells (identified by TIGIT, CTLA4, IL2RA), and proliferating cells (indicated by MKI67, RRM2, TYMS) (Fig 3D). Notably, sub-clusters 2 and 3 were more prevalent in BM-lung, and further analysis confirmed that both of these sub-clusters comprised cytotoxic CD4+ T cells (48.2% in BM-lung and 7.

Validation of results
To assess the accuracy and practicality of our findings, we conducted a validation using two samples from the GEO database: one GM sample (GSM3984326 from GSE135045) and one BM-lung sample (GSM6112137 from the GSE202371 dataset). These samples yielded a total of 88,291 cells from two individuals, which were subsequently processed and filtered down to 8,413 cells. The datasets were integrated using the same methodology outlined in the methods section. After integration, we examined the expression of marker genes for various cell types, including T cells, exhausted CD8+ T cells, cytotoxic CD4+ T cells, macrophages, MG, and AT2 epithelial cells. The validation results echoed the findings obtained from the comparison between BM-lung and glioma. Notably, cytotoxic CD4+ T cells, exhausted CD8+ T cells, and AT2 epithelial cells exhibited higher abundance in BM-lung (Fig 6A). In contrast, macrophages, microglia, OPCs, and oligodendrocytes were more prevalent in GM (Fig 6B).
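The abundance comparisons reported throughout the results reduce to per-group cell-type percentages. A minimal Python sketch of that calculation is given below. The function name and the input format (one annotated cell-type label per cell within a tumor group) are assumptions made for illustration and do not reflect the code used in the study.

```python
from collections import Counter

def cell_type_percentages(labels):
    """Percentage of cells carrying each cell-type label within one tumor group."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {cell_type: 100.0 * n / total for cell_type, n in counts.items()}
```

Computing these percentages separately for each group (for example GM and BM-lung) gives the values that are compared side by side in the pie charts and bar plots.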

Discussion
BM represents a pressing healthcare concern in oncology. Given the bleak prognoses associated with primary cancers that have spread to the brain, especially lung and breast cancers, there is an urgent need to improve our understanding of the underlying pathogenic mechanisms and uncover novel targets for immune-based therapies [16]. It has become increasingly evident that the TME within the brain plays a pivotal role in shaping cancer progression and the efficacy of treatments, both for primary brain tumors and metastatic lesions. Mechanistic insights into the tumor-promoting activities of various constituents within the brain TME have unveiled multiple potential targets for therapeutic intervention [23,47].
In recent years, significant research effort has been dedicated to exploring the interplay between the immune system and the TME in BM. This exploration has led to a paradigm shift, recognizing the CNS as an immunologically distinct domain, as opposed to an isolated one [22]. Furthermore, advancements in scRNA-seq have facilitated comprehensive analysis of tumor and immune microenvironment heterogeneity across various cancer types [15]. To unravel the intricacies of the tumor microenvironment in both BM and GM, we analyzed extensive scRNA-seq datasets encompassing lung-to-brain and breast-to-brain metastasis and compared the diversity of cellular constituents between these entities.
Our focus centered on profiling the distinct cell lineages that coalesce within tumors, encompassing immune cells, oligodendrocytes, endothelial cells, epithelial cells, and fibroblasts. Following cluster assignment, we identified a unique subset of AT2 epithelial cells within BM-lung, with a notably lower prevalence in glioma and BM-breast. Previous research has underscored the significance of AT2 cells as crucial players in lung cancer origin and their potential role in facilitating lung cancer metastasis [24,48]. Hence, this specific cluster and its distinct gene expression markers (GPRC5A, NAPSA, and SLC34A2) hold promise as prognostic indicators for lung cancer, a prominent source of brain metastases [33,34].
Tumor cells can incite an immune response, leading to a complex equilibrium in which diverse immune subtypes can foster tumor progression and metastasis or confer resistance to treatments [49,50]. Among the key elements in the microenvironment contributing to immune evasion are the expansion of pro-inflammatory macrophages and the malfunctioning of T cells. Macrophages, known for their role in maintaining tissue homeostasis, have garnered increasing attention in the tumor microenvironment of brain tumors. Interestingly, when influenced by cancer cells, macrophages tend to polarize into immune suppressor cells, thereby promoting an environment conducive to tumor growth and evasion of immune surveillance [22,51,52].
Our findings highlight the prevalence of myeloid cells, specifically monocytes and macrophages, as the predominant immune cell types in GM. Prior research has illuminated the dynamic roles played by myeloid cells in cancer, with functions ranging from anti-tumor activity to the promotion of tumor growth, contingent upon the cancer type and its stage [42,53]. Furthermore, existing studies have pointed out distinctions between the uptake rates of macrophages in BM compared to GM. Despite functional similarities between these two diseases, differences in their microenvironments persist [26,54]. Given the pivotal role of TAMs in orchestrating tumor progression, targeted interventions are gaining traction as a strategy to disrupt the tumor-promoting activities of TAMs [54]. In our sub-clustering analysis of TAMs within GM, BM-lung, and BM-breast, noteworthy patterns emerged. Specifically, the MG subtype exhibited higher prevalence in GM compared to BM-lung and BM-breast. In contrast, the MDM subtype was more abundant in BM-lung and BM-breast. Prior investigations have established that MDM is implicated in processes such as antigen presentation, immune suppression, and wound healing, whereas MG is associated with host defense mechanisms and maintenance activities such as synaptic pruning [26,37,55].
Considering the distinct roles played by each of the two subtypes in BM and GM, knowing the presence of their respective populations within these conditions can significantly assist in tailoring treatments based on the immune cell landscape. Comparing the transcriptional profiles of BM-breast and GM, we identified elevated frequencies of epithelial and fibroblast cell types in BM-breast. Conversely, in GM cases, heightened frequencies were observed among myeloid cells, B cells, and oligodendrocytes. Earlier studies on BM have indicated that patients with BM-breast exhibit increased frequencies of macrophages, whereas cases of lung cancer demonstrated higher T cell frequencies [8].
Consistent with these prior findings and in alignment with our results, the immune cell clusters and frequencies within BM differ based on the cancer's point of origin. These disparities have the potential to influence targeted therapeutic approaches for BM [8,21]. Upon closer examination of the immune cell subgroups, we noted that elevated macrophage levels in GM, rather than BM, are linked to MG. MG, often referred to as brain macrophages, play a crucial role in immune regulation and the elimination of tumors [8]. Sub-analysis of the T cell landscape reveals that T cells (both CD4+ and CD8+) dominate the immune cell composition in BM-lung compared to GM. When investigating the T cell sub-clusters in BM-lung and GM, it becomes apparent that cytotoxic CD4+ T cells (characterized by GZMM, GZMA, GZMB, IFNG, and GNLY) and exhausted CD8+ T cells (marked by CD8A, CTLA4, LAG3, and TIGIT) [33] exhibit a high frequency in BM-lung. Previous studies on brain metastasis have shown a higher population of leukocytes in brain metastasis than in CNS-endogenous cancers [55-58].
Within the realm of immunotherapies, the establishment of a lasting response is limited to a subset of patients, primarily due to the tumor microenvironment's suppressive effects on the immune system. It is worth noting that CD4+ T cells recognize distinct surface markers compared to CD8+ T cells, and given that cancer cells generally lack MHC-II expression, CD4+ T cells demonstrate effectiveness in exerting tumor suppression through interactions with stromal cell surface markers. For instance, these interactions involve macrophages, B cells, and the release of cytokines to facilitate CD8+ T cell activation. After antigen encounter, a significant portion of T effector cells undergo apoptosis, leading to the expansion of the exhausted T cell phenotype within cytotoxic effector populations in CD8+ T cell subsets [31,52,59,60].
CD8+ T cells function as cytotoxic lymphocytes responsible for detecting and responding to infections. Following pathogen elimination, effector CD8+ T cells transition into memory cells that provide protection upon subsequent exposure to antigens. However, in the context of malignancy, effector CD8+ T lymphocytes experience exhaustion, resulting in diminished proliferative and cytotoxic capabilities [61-63]. Recent investigations underscore that T cell exhaustion and functional impairment within the TME are fundamental features of various malignancies. These alterations in immune cell populations contribute to the transformation of the tumor milieu into an immunosuppressive environment [31,59]. Collectively, these data highlight the substantial role played by the tumor's origin in shaping the specific characteristics of the brain metastatic tumor microenvironment.
Cancer-associated fibroblasts (CAFs) constitute the predominant component of the tumor microenvironment. As the tumor grows, the healthy breast matrix undergoes disruption, marked by a reduction in the number of healthy fibroblasts and their conversion into CAFs. CAFs release various cytokines, chemokines, extracellular matrix regulatory molecules, components of the extracellular matrix, and inflammatory mediators, collectively promoting tumor cell proliferation, invasion, metastasis, and evasion of immune surveillance [37,39]. These CAFs contribute to the TME by generating enzymes that crosslink the matrix and proteases that degrade the extracellular matrix, significantly contributing to increased tissue stiffness and facilitating metastasis [64,65].
Numerous studies have confirmed the presence of diverse phenotypic and functional CAF populations within a single tumor, observed both in vivo and in vitro, along with single-cell analyses of pre-sorted CAFs [64,66-70]. Importantly, this study represents the first attempt to compare CAF subpopulations between BM-breast and GM. Through our analysis of single-cell RNA sequencing data, we identified six distinct CAF subtypes based on their transcriptome profiles. The allocation of cells within these sub-clusters reveals higher levels of apCAFs and tpCAFs in GM, while myCAFs predominate in BM-breast. Previous research has shown that myCAFs actively produce a range of matrix components and participate in matrix remodeling. They also secrete cytokines and chemokines, along with inflammatory factors that facilitate tumor cell adhesion and migration. On the other hand, tpCAFs express metalloproteinases and matrix proteins, which contribute to extracellular matrix remodeling [45]. Additionally, dCAFs exhibit a more specialized expression pattern, focused on the production of basement membrane components and paracrine signaling molecules [41,71]. Earlier investigations have indicated that GM do not exhibit a high abundance of CAFs; however, CAFs do accumulate within the GM microenvironment. These CAFs develop a more mesenchymal phenotype, contributing to enhanced migratory and invasive behaviors in malignant cells [72-74].
Chemotherapy is a frequent treatment for cancers and can significantly impact the disease process. The treatment sensitivity of cancer cells is heavily influenced by their interactions with the tumor microenvironment (TME), especially immune cells. There is currently significant interest in targeting stromal cells in cancer therapy [75-77]. It is crucial to accurately assess the composition of stromal cells in tumors, replicate the diverse characteristics seen in human tumors in clinical models, and understand how this diversity impacts treatment efficacy, drug responses, and resistance [78,79]. Clinical studies indicate a correlation between CD4, CD8, and macrophage levels in the tumor microenvironment and treatment outcomes. The immune system plays an active role in illness, potentially affecting clinical responses and resistance to treatment. An abundance of macrophages can hinder therapeutic effectiveness. For instance, in patients with node-positive breast cancer who underwent intensive chemotherapy, those with tumors exhibiting high levels of macrophages, high CD4 T cells, but low CD8 T cells experienced significantly lower recurrence-free survival rates compared to those with tumors showing low macrophage levels, low CD4 T cells, and high CD8 T cells. In addition, cytotoxic T cells play a positive role in treatment effectiveness [80-82].
Furthermore, cancer-associated fibroblasts (CAFs) play a crucial role in the treatment process due to their increased proliferation, enhanced extracellular matrix production, and unique cytokine secretion compared to normal tissue fibroblasts. Early co-culture studies suggested that injured or irradiated fibroblasts might promote cancer cell proliferation more effectively than non-irradiated fibroblasts, indicating that within a solid tumor, fibroblasts are not passive entities and could potentially influence therapy outcomes [81,83,84].
Researchers identified components produced by normal human fibroblasts in a preclinical model of genotoxic damage and discovered that WNT-16b might drastically restrict tumor response through paracrine signaling. In a WNT-dependent manner, elevated amounts of this ligand promoted the growth of cancer cells and produced a mesenchymal phenotype. The responsiveness of fibroblasts to chemotherapy was enhanced by removing WNT-16b. Stromal fibroblasts secreted WNT-16b through an NF-κB-mediated pathway linked to inflammation and stress. Because stress-response programs in stromal cells might decrease treatment efficacy by providing a protective environment for cancer cells, the supporting stroma's reaction to therapy may be more complicated [84].
The study has several limitations, including potential variability from integrating single-cell RNA sequencing datasets from different sources such as lung-to-brain metastasis, breast-to-brain metastasis, and gliomas, which may introduce batch effects despite the use of the anchor technique for data harmonization. Additionally, the sample size of both patients and cells analyzed may limit the statistical power and comprehensive representation of heterogeneity within gliomas and brain metastases. Furthermore, the findings, based on a specific patient cohort, may not fully represent broader populations or diverse clinical settings, thus limiting the generalizability of the results. The identified cell types and their associations with disease progression or therapeutic response are observational and require further validation through functional studies and clinical trials.

Conclusion
In conclusion, the prevalence of distinct cell types with varying population proportions in BM arising from different primary cancers holds potential for aiding in the prediction of the primary cancer type. Our findings underscore the significance of macrophages as abundant and crucial elements within GM. Conversely, BM-lung exhibit elevated populations of AT2 cells, cytotoxic CD4+ T cells, and exhausted CD8+ T cells. Meanwhile, BM-breast are characterized by an abundance of epithelial cells and myCAFs. Our study not only sheds light on the heterogeneity of the TME between BM-lung and BM-breast cases but also introduces the possibility of leveraging well-known markers for these cell types to distinguish primary brain metastatic cancers. Looking ahead, it remains imperative to discern the nuanced variations in microenvironmental composition across diverse brain tumor subtypes to attain a comprehensive understanding of tumor biology. Consequently, enhancing our understanding of the tumor microenvironment contributes to the identification of primary tumors within brain metastasis scenarios, thereby facilitating the selection of more tailored treatment approaches based on the originating primary cancers.

Fig 2. Transcriptional landscape of BM-lung and GM using 55,749 cells. (A) UMAP plot displaying 55,749 high-quality cells from 13 samples of BM-lung and GM, color-coded based on their original datasets. (B) UMAP plot presenting 55,749 high-quality cells from 13 samples of BM-lung and GM, color-coded by their clusters. (C) Assignment of cell types to clusters based on gene marker expression patterns in BM-lung and GM. (D) Dot plots illustrating conserved and cell-type-specific markers in BM-lung and GM. (E) Heatmap of gene expression levels of top-ranking marker genes in 15 different clusters. (F) Pie charts representing the percentage of each cell type in BM-lung and GM. https://doi.org/10.1371/journal.pone.0306220.g002

A distinct cluster contained cell types that could not be definitively determined. Canonical markers for each cell type in BM-lung and GM are depicted in Fig 2C, and the determined clusters are shown in UMAP space in Fig 2D. The top 10 genes expressed in each cluster are shown in heatmaps (Fig 2E). The distribution of cell percentages in BM-lung and GM is visualized in Fig 2F.

Fig 3. Sub-clustering of immune cells in the transcriptional landscape of BM-lung and GM. (A) Sub-clustering of macrophages and the identification of two clusters of microglia and MDM (UMAP plot). (B) Feature plot displaying the expression of three marker genes for each sub-cluster of macrophages. (C) Distribution of macrophage sub-clusters between GM and BM-lung in bar plot. (D) Sub-clustering of CD4+ T cells depicted in the UMAP plot. (E) Feature plot illustrating the expression of three marker genes for each sub-cluster of CD4+ T cells. (F) Distribution of CD4+ T cell sub-clusters between GM and BM-lung in bar plot. (G) Sub-clustering of CD8+ T cells shown in the UMAP plot. (H) Feature plot displaying the expression of three marker genes for each sub-cluster of CD8+ T cells. (I) Distribution of CD8+ T cell sub-clusters between GM and BM-lung in bar plot. https://doi.org/10.1371/journal.pone.0306220.g003

The top 10 genes expressed in each cluster are shown in heatmaps (Fig 4E). The distribution of cell percentages in BM-breast and GM is displayed in Fig 4F (S5 Table in S1 File).

Fig 4. Transcriptional landscape of BM-breast and GM using 69,785 cells. (A) UMAP plot displaying 69,785 cells from 12 samples of BM-breast and GM, color-coded according to their original datasets. (B) UMAP plot presenting 55,749 high-quality cells from 12 samples of BM-breast and GM, color-coded by their clusters. (C) Assignment of cell types to clusters based on gene marker expression patterns in BM-breast and GM. (D) Dot plots illustrating conserved and cell-type-specific markers in BM-breast and GM. (E) Heatmap of gene expression levels of top-ranking marker genes in 12 different clusters. (F) Pie charts representing the percentage of each cell type in BM-breast and GM. https://doi.org/10.1371/journal.pone.0306220.g004

Fig 5. Sub-clustering of macrophage and fibroblast cells in the transcriptional landscape of BM-breast and GM. (A) UMAP plot displaying the sub-clustering of macrophages and the identification of two clusters of MG and MDM. (B) Feature plot illustrating the expression of three marker genes for each sub-cluster of macrophages. (C) Distribution of CD4+ T cell sub-clusters between GM and BM-breast in bar plot. (D) UMAP plot showcasing the sub-clustering of CAFs. (E) Feature plot displaying canonical marker genes for each sub-cluster of CAFs. (F) Distribution of CAF sub-clusters between GM and BM-breast in bar plot. https://doi.org/10.1371/journal.pone.0306220.g005

Fig 6. Expression levels of canonical marker genes for cell types in BM-lung and GM test samples. (A) Expression levels of canonical marker genes with higher expression in BM-lung. (B) Expression levels of canonical marker genes with higher expression in GM. https://doi.org/10.1371/journal.pone.0306220.g006